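Code-mixed text data consists of sentences containing words or phrases from more than one language. Most multilingual communities worldwide communicate in several languages, and English is usually one of them. Hinglish is code-mixed text composed of Hindi and English but written in Roman script. This paper aims to determine the factors that influence the quality of system-generated code-mixed text data. For the HinglishEval task, the proposed model uses multilingual BERT to measure the similarity between synthetically generated and human-generated sentences in order to predict the quality of synthetically generated Hinglish sentences.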
Translated by Google Translate
We consider a long-term average profit maximizing admission control problem in an M/M/1 queuing system with a known arrival rate but an unknown service rate. With a fixed reward collected upon service completion and a cost per unit of time enforced on customers waiting in the queue, a dispatcher decides upon arrivals whether to admit the arriving customer or not based on the full history of observations of the queue length of the system. \cite[Econometrica]{Naor} showed that if all the parameters of the model are known, then it is optimal to use a static threshold policy: admit if the queue length is less than a predetermined threshold, and reject otherwise. We propose a learning-based dispatching algorithm and characterize its regret with respect to optimal dispatch policies for the full information model of \cite{Naor}. We show that the algorithm achieves an $O(1)$ regret when all optimal thresholds with full information are non-zero, and achieves an $O(\ln^{3+\epsilon}(N))$ regret in the case that an optimal threshold with full information is $0$ (i.e., an optimal policy is to reject all arrivals), where $N$ is the number of arrivals and $\epsilon>0$.
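The full-information benchmark in the abstract above can be illustrated with a minimal sketch. It is not the paper's learning algorithm. It assumes every parameter is known and that the holding cost is charged on every customer in the system, including the one in service (the abstract's wording leaves this open). It evaluates the long-run average profit of each static threshold from the stationary distribution of the resulting finite queue and picks the best one by brute force. The function names and the `max_threshold` search cap are hypothetical.

```python
def stationary_distribution(arrival_rate, service_rate, threshold):
    """Queue-length distribution when arrivals are admitted only while the
    queue length is below `threshold` (a finite M/M/1 queue of that capacity)."""
    rho = arrival_rate / service_rate
    weights = [rho ** n for n in range(threshold + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def average_profit(arrival_rate, service_rate, reward, holding_cost, threshold):
    """Long-run average profit per unit time of a static threshold policy.

    Assumption: the holding cost applies to every customer in the system.
    """
    pi = stationary_distribution(arrival_rate, service_rate, threshold)
    # Admitted arrivals per unit time; in steady state this equals the
    # service completion rate, so it is also the rate of reward collection.
    throughput = arrival_rate * (1.0 - pi[threshold])
    mean_in_system = sum(n * p for n, p in enumerate(pi))
    return reward * throughput - holding_cost * mean_in_system


def best_threshold(arrival_rate, service_rate, reward, holding_cost,
                   max_threshold=50):
    """Brute-force search over thresholds 0..max_threshold (hypothetical cap)."""
    return max(
        range(max_threshold + 1),
        key=lambda k: average_profit(
            arrival_rate, service_rate, reward, holding_cost, k
        ),
    )


if __name__ == "__main__":
    k = best_threshold(arrival_rate=1.0, service_rate=2.0,
                       reward=5.0, holding_cost=1.0)
    print("best static threshold:", k)
```

A threshold of 0 rejects every arrival and earns zero profit, so under these assumptions it is selected whenever the reward is too small to cover the expected holding cost of even a single admitted customer.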
Active target sensing is the task of discovering and classifying an unknown number of targets in an environment and is critical in search-and-rescue missions. This paper develops a deep reinforcement learning approach to plan informative trajectories that increase the likelihood for an uncrewed aerial vehicle (UAV) to discover missing targets. Our approach efficiently (1) explores the environment to discover new targets, (2) exploits its current belief of the target states and incorporates inaccurate sensor models for high-fidelity classification, and (3) generates dynamically feasible trajectories for an agile UAV by employing a motion primitive library. Extensive simulations on randomly generated environments show that our approach is more efficient in discovering and classifying targets than several other baselines. A unique characteristic of our approach, in contrast to heuristic informative path planning approaches, is that it is robust to varying amounts of deviations of the prior belief from the true target distribution, thereby alleviating the challenge of designing heuristics specific to the application conditions.
While generative models produce high-quality images of concepts learned from a large-scale database, a user often wishes to synthesize instantiations of their own concepts (for example, their family, pets, or items). Can we teach a model to quickly acquire a new concept, given a few examples? Furthermore, can we compose multiple new concepts together? We propose Custom Diffusion, an efficient method for augmenting existing text-to-image models. We find that only optimizing a few parameters in the text-to-image conditioning mechanism is sufficiently powerful to represent new concepts while enabling fast tuning (~6 minutes). Additionally, we can jointly train for multiple concepts or combine multiple fine-tuned models into one via closed-form constrained optimization. Our fine-tuned model generates variations of multiple, new concepts and seamlessly composes them with existing concepts in novel settings. Our method outperforms several baselines and concurrent works, regarding both qualitative and quantitative evaluations, while being memory and computationally efficient.
Microprocessor architects are increasingly resorting to domain-specific customization in the quest for high performance and energy efficiency. As systems grow in complexity, fine-tuning architectural parameters across multiple sub-systems (e.g., datapath, memory blocks in different hierarchies, interconnects, compiler optimization) quickly results in a combinatorial explosion of the design space. This makes domain-specific customization an extremely challenging task. Prior work explores using reinforcement learning (RL) and other optimization methods to automatically explore the large design space. However, these methods have traditionally relied on single-agent RL/ML formulations. It is unclear how scalable single-agent formulations are as we increase the complexity of the design space (e.g., full stack System-on-Chip design). Therefore, we propose an alternative formulation that leverages Multi-Agent RL (MARL) to tackle this problem. The key idea behind using MARL is the observation that parameters across different sub-systems are more or less independent, which allows a decentralized role to be assigned to each agent. We test this hypothesis by designing a domain-specific DRAM memory controller for several workload traces. Our evaluation shows that the MARL formulation consistently outperforms single-agent RL baselines such as Proximal Policy Optimization and Soft Actor-Critic over different target objectives such as low power and latency. This work thus opens the pathway for new and promising research in MARL solutions for hardware architecture search.
Generalizability of time series forecasting models depends on the quality of model selection. Temporal cross validation (TCV) is a standard technique to perform model selection in forecasting tasks. TCV sequentially partitions the training time series into train and validation windows, and performs hyperparameter optimization (HPO) of the forecast model to select the model with the best validation performance. Model selection with TCV often leads to poor test performance when the test data distribution differs from that of the validation data. We propose a novel model selection method, H-Pro, that exploits the data hierarchy often associated with a time series dataset. Generally, the aggregated data at the higher levels of the hierarchy show better predictability and more consistency than the bottom-level data, which is sparser and (sometimes) intermittent. H-Pro performs the HPO of the lowest-level student model based on the test proxy forecasts obtained from a set of teacher models at higher levels in the hierarchy. The consistency of the teachers' proxy forecasts helps select better student models at the lowest level. We perform extensive empirical studies on multiple datasets to validate the efficacy of the proposed method. H-Pro, combined with off-the-shelf forecasting models, outperforms existing state-of-the-art forecasting methods, including the winning models of the M5 point-forecasting competition.
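The sequential partitioning that TCV performs can be sketched as follows. This is a minimal illustration of one common variant (expanding train windows followed by fixed-size validation windows), not the paper's exact protocol. The function name and the parameters `n_folds` and `val_size` are assumptions introduced for the example.

```python
def temporal_cv_splits(n_samples, n_folds, val_size):
    """Yield (train_indices, val_indices) pairs in temporal order.

    Each validation window directly follows its train window, so no fold
    trains on observations that come after the data it is validated on.
    """
    if n_folds * val_size >= n_samples:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_folds):
        # The last fold validates on the final `val_size` observations;
        # earlier folds slide the validation window back in time.
        val_end = n_samples - (n_folds - 1 - fold) * val_size
        val_start = val_end - val_size
        yield range(0, val_start), range(val_start, val_end)


if __name__ == "__main__":
    for train_idx, val_idx in temporal_cv_splits(n_samples=10, n_folds=3, val_size=2):
        print(list(train_idx), list(val_idx))
```

In an HPO loop, each candidate configuration would be fitted on every train window and scored on the matching validation window, and the configuration with the best average validation score would be selected.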
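We propose a new formulation for the multi-robot task planning and allocation problem that incorporates (a) precedence relationships between tasks; (b) coordination for tasks that allow multiple robots to achieve increased efficiency; and (c) cooperation through the formation of robot coalitions for tasks that cannot be performed by individual robots alone. In our formulation, a task graph specifies the tasks and the relationships between them. We define a set of reward functions over the nodes and edges of the task graph. These functions model the effect of robot coalition size on task performance and incorporate the influence of one task's performance on a dependent task. Solving this problem optimally is NP-hard. However, the task graph formulation allows us to leverage min-cost network flow approaches to obtain approximate solutions efficiently. Additionally, we explore a mixed integer programming approach, which gives optimal solutions for small instances of the problem but is computationally expensive. We also develop a greedy heuristic algorithm as a baseline. Our modeling and solution approaches result in task plans that leverage task precedence relationships and robot coordination and cooperation to achieve high mission performance, even in large missions with many agents.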
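Road rutting is a severe form of road distress that can cause premature road failure and incur early and costly maintenance. Research on road damage detection using image processing techniques and deep learning has been actively conducted over the past few years. However, these studies have mainly focused on detecting cracks, potholes, and their variants. Very little research has addressed the detection of road rutting. This paper proposes a novel road rutting dataset comprising 949 images with both object-level and pixel-level annotations. Object detection models and semantic segmentation models were deployed to detect road rutting on the proposed dataset, and the model predictions were analyzed quantitatively and qualitatively to evaluate model performance and to identify the challenges faced when detecting road rutting with the proposed method. The object detection model YOLOX-s achieves an mAP@IoU=0.5 of 61.6%, and the semantic segmentation model PSPNet (ResNet-50) reaches 54.69, with an accuracy of 72.67, thus providing benchmark accuracy for similar work in the future. The proposed road rutting dataset and the results of our study will help accelerate research on detecting road rutting with deep learning.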
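The mAP@IoU=0.5 figure counts a predicted box as a match when its intersection over union (IoU) with a ground-truth box is at least 0.5. The sketch below shows only that overlap test, not the full mAP computation. It assumes axis-aligned boxes given as `(x_min, y_min, x_max, y_max)` tuples, and the function names are hypothetical.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max) tuples; this layout is an
    assumption of the example, not something fixed by the dataset.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_match(pred_box, gt_box, iou_threshold=0.5):
    """True when the prediction overlaps the ground truth enough to count."""
    return box_iou(pred_box, gt_box) >= iou_threshold


if __name__ == "__main__":
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(is_match((0, 0, 2, 2), (0, 0, 2, 3)))
```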
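Previous studies of the perimeter defense game have largely focused on the fully observable setting, in which the true player states are known to all players. However, this is unrealistic for practical implementation, since defenders may have to perceive the intruders and estimate their states. In this work, we study the perimeter defense game in a photo-realistic simulator and in the real world, requiring defenders to estimate intruder states from vision. We train a machine learning-based system for intruder pose detection with domain randomization; the system aggregates multiple views to reduce state estimation errors and adapts the defensive strategy to account for them. We introduce new performance metrics to evaluate vision-based perimeter defense. Through extensive experiments, we show that our approach improves state estimation and, ultimately, perimeter defense performance in both 1-defender-vs-1-intruder games and 2-defenders-vs-1-intruder games.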
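This study provides a novel framework for estimating the economic, environmental, and social value of public transit buses for cities worldwide, based on open-source data. Electric buses are a compelling candidate to replace diesel buses because of their environmental and social benefits. However, state-of-the-art models for evaluating the value of bus electrification are limited in applicability because they require granular, bespoke data on bus operations that can be difficult to procure. Our valuation tool uses the General Transit Feed Specification, a standard data format used by transit agencies worldwide, to provide high-level guidance on developing a prioritization strategy for electrifying a bus fleet. We develop physics-informed machine learning models to evaluate the energy consumption, carbon emissions, health impacts, and total cost of ownership for each transit route. We demonstrate the scalability of our tool with a case study of the bus lines in the Greater Boston and Milan metropolitan areas.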